perm filename LABEL.XFR[DOC,LMM] blob sn#062951 filedate 1973-09-13 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00002 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00005 00002	.<< NOTE THE FOLLOWING CONVENTIONS IN THIS TEXT : >>
C00048 ENDMK
C⊗;
.<< NOTE THE FOLLOWING CONVENTIONS IN THIS TEXT : >>
.<< ?? MEANS QUESTION MARK
.<< ?B MEANS BEGIN BOLD FACE
.<< ?I MEANS BEGIN ITALICS
.<< ?N MEANS RESTORE NORMAL TEXT
.<< ?U MEANS SUPERSCRIPT NEXT CHAR
.<< ?D MEANS SUBSCRIPT NEXT CHAR
,<< ?P MEANS THE CHARACTER PI
.<< ?( IS A LEFT BRACE
.<< ?) IS A RIGHT BRACE
.<< ?> IS GREATER THAN OR EQUAL SIGN
.<< ?< IS LESS THAN OR EQUAL SIGN
.<< ?/ IS A VERTICAL BAR
.<< ?A IS AN AMPERSAND (USED FOR MAKING FRACTIONS
.<< ?L IS LEFT SQUARE BRACKET
.<< ?R IS RIGHT SQUARE BRACKET
. <<  title page >>
.BEGIN CENTER
Applications of Artificial Intelligence for Chemical Inference XIII.
Labelling Objects Having Symmetry<<ACK1:We gratefully acknowledge the
support of this research by the National Institutes of Health (RR
00612-03)
and the Advanced Research Projects Agency (SD-183).
>?U,<<ML:For part XII, see L. Masinter, N. S. Sridharan, J.
Lederberg, and
D. H. Smith, ?IJ. Amer. Chem. Soc.?N, ?B00?N, 0000 (0000).>

Larry Masinter and N.S. Sridharan

Contribution from the Computer Science Department
Stanford University
Stanford, California 94305
.END
.GROUP SKIP 5
.BEGIN INDENT 6,6,6
ABSTRACT. An algorithm for finding a complete set of non-equivalent
labellings of a symmetric object and applications of the algorithm
to problems in chemistry are presented.
.END
.SKIP TO COLUMN 1
The class of combinatorial problems dealing with finding a complete
set of non-isomorphic objects under varying constraints and concepts
of isomorphism, has wide applications in a variety of fields. The
problem attacked in this paper is one of finding all distinct ways
to assign a given collection of labels or colors to the parts of a
symmetric object.
In chemistry, one manifestation of this
problem is to make all assignments of ligands (from a fixed set) to
a given carbocyclic skeleton
.REFERENCE! ML?). Part A of
this paper may be read as a brief tutorial on the nature of the
problem and an introduction to the terminology found in more
technical treatments. Part B is a textual description of a method
for the solution of this type of problem. Part C is a summary of the
procedure in a more algorithmic form; an even more formal
description and a proof of correctness is available elsewhere<<BROWN:
(a) H. Brown, L. Masinter, and L. Hjelemeland, ?IDiscrete Math.?N in
press; (b)
Stanford Computer Science Memo 31B, Computer Science Department,
Stanford University.>.
Part D gives examples and applications of
this algorithm in both organic and inorganic chemistry.

This problem is encountered in a wide range of applications beyond
chemistry-- within many areas of graph theory and combinatorics, for
example.
It has been known how to compute the number of solutions<<DEBRUIJN:
N. G. DeBruijn, ?INedarl. Akad. Wentesh. Proc. Ser. A?N, ?B62?N, 59
(1959).>?U,<<polya:
N. G. DeBruijn, in "Applied Combinatorial Mathematics," E.
Beckenbach, Ed.,
John Wiley,
New York, 1964, p. 144.>,
but an efficent method of actually constructing the solutions has
not previously been published<<PERLMAN:an alternative algorithm has
been
described to us in a private communication from D. M. Perlman,
University
of California, San Diego, California.>.

.TITL PART A: DEFINITIONS

.HD SYMMETRY AND ITS RELATIONSHIP TO LABELLING

Consider the special case of the general problem: suppose all of the
labels are distinct. For example, suppose that one wishes to
number the faces of a cube with the numbers ?(1, 2, 3, 4, 5, 6?), or
the
"nodes" (atom positions in a graph) of the decalin skeleton (?B1?N)
with numbers ?(1, 2, 3, 4, 5, 6, 7, 8, 9, 10?).

.       FIGURE 1,8,The Decalin Skeleton

There are n! (n factorial)
ways of labelling where n is the number of parts.  If the object has
no symmetry then each of these n! ways are distinct from the rest.
However for the decalin skeleton (where n! = 10! = 3,628,800 ways)
there is some symmetry. First one picks, arbitrarily, one of the
numberings as a point of reference (e.g. ?B2a?N).

.       FIGURE 2,10,Three of the 10! numberings of the Decalin
Skeleton.

Some of the 10! ways might be viewed as different (e.g. ?B2b?N); some
of them might be viewed as the same (e.g. ?B2c?N). Intuitively,
?B2a?N
and ?B2c?N are equivalent because one could flip ?B2a?N over the 3-8
axis and get ?B2c?N.  There is another way of determining the
"sameness"
of such numberings which is easier in more complicated cases and is
generally more suited to computer application:

.BEGIN I6

?BDEFINITION:?N Two numberings of the nodes of a graph are
?Iequivalent?N if the connection tables with the respective
numberings
are identical when the node numbers are written in ascending order
and each "connected to" list is in ascending order.

.END

Table I contains the respective "connection tables" of structures
?B2a-c?N.
Note that the connection table for ?B2c?N is identical to that of
?B2a?N;
that of ?B2b?N is different.

.BEGIN NOFILL GROUP
.SELECT 4
.SEPLIN

?BTable I.?N Connection Tables for Structures ?B2a-c?N.

             ?B2a?N             ?B2b?N             ?B2c?N
         node?/connected ?/ node?/connected ?/ node?/connected
             ?/   to     ?/     ?/   to     ?/     ?/  to
                        ?/               ?/
           1    2,1O    ?/  1    8,9      ?/  1    2,1O
           2    1,3     ?/  2    3,7      ?/  2    1,3
           3    2,4,8   ?/  3    2,6      ?/  3    2,4,8
           4    3,5     ?/  4    6,8,1O   ?/  4    3,5
           5    4,6     ?/  5    9,1O     ?/  5    4,6
           6    5,7     ?/  6    3,4      ?/  6    5,7
           7    6,8     ?/  7    2,8      ?/  7    6,8
           8    3,7,9   ?/  8    1,4,7    ?/  8    3,7,9
           9    8,1O    ?/  9    1,5      ?/  9    8,1O
           1O   1,9     ?/  1O   4,5      ?/  1O   1,9
.SEPLIN
.END

This definition means, among other things, that properties such as
valency are preserved: If two numberings are equivalent and in the
first, node 1 is trivalent, then in the second, node 1 is trivalent.
If there are other properties of the nodes (they are already colored
or labelled, for example), then this definition can be extended to
include the preservation of those properties.

For example, suppose there are atom names associated with (some of)
the nodes of the graph.  Then one can define equivalent numberings
to be those which not only have identical connection tables, but
also the same atom names
for the corresponding nodes. Then ?B3a?N
would still be equivalent to ?B3c?N but no longer to ?B3b?N since,
although the connection tables of ?B3a?N and ?B3b?N are identical,
node
1 in ?B3a?N is labelled with an "N" while in ?B3b?N it unlabelled.

.Figure 3,10,Partially labelled graphs reduce the equivalencies.

.HD PERMUTATIONS AND PERMUTATION GROUPS

Given a numbering of a graph, one can use a condensed notation
to write down other numberings in terms of the first. All of these
are written down with respect to an original "reference" numbering.
Using the numbering of ?B2a?N as the reference numbering, numberings
for ?B2a-c?N are given in Table II. In
the 2b case, the row of numbers means that, in sequence, the node
numbered 1 in the reference numbering of ?B2a?N is now numbered 2,
the node
originally numbered 2 is now numbered 10, and so on.

.BEGIN NOFILL GROUP SELECT 4
.SEPLIN

?BTable II?N  Condensed Notation for Numberings of ?B2a-c?N

          ?B2a?N numbers:  1  2  3  4  5  6  7  8  9 1O

          ?B2b?N numbers:  2  7  8  1  9  5 1O  4  6  3

          ?B2c?N numbers:  5  4  3  2  1 1O  9  8  7  6
.SEPLIN
.END

One can conceptualize a numbering as a transformation or as a
function: The transformation ?P for ?B2c?N is ?P?D2?Dc(1)=5,
?P?D2?Dc(2)=4,
?P?D2?Dc(3)=3, ... ?P?D2?Dc(10)=6.  These transformations are
?Ipermutations?N: one to one mappings from the integers ?(1,2,...,n?)
to
themselves.  The transformation for the "reference" numbering is the
identity ?P?Di(x)=x. It can be shown that whatever the graph, the set
of
transformations satisfying the "equivalency" requirement above
satisfies the property of a group.  One may then say:

.BEGIN I6

The ?Isymmetry group?N of a graph is the set of all transformations
which
yield identical connection tables.  (If there are other properties
to be considered, one may include them as part of the connection
table).

.END; CONTINUE

For the decalin skeleton there are 4 permutations in the symmetry
group,
given in Table III.

.BEGIN NOFILL GROUP SELECT 4
.SEPLIN

?BTable III.?N The Symmetry Group of the Decalin Skeleton


                 ?P?Di?Ua    1  2  3  4  5  6  7  8  9 1O
                 ?P?Dv     5  4  3  2  1 1O  9  8  7  6
                 ?P?Dh    1O  9  8  7  6  5  4  3  2  1
                 ?P?D?L18O?R   6  7  8  9 1O  1  2  3  4  5

?Ua The reference numbering corresponds to that given for ?B2a?N.
.SEPLIN
.END

These correspond directly to the intuitive geometric symmetries
?P?Di=identity, ?P?Dh=reflection about horizontal axis,
?P?Dv=reflection
about vertical axis, ?P?D?L180?R = rotation by 180 degrees.
It is not too difficult for a computer program to figure out the
symmetry
group of a graph given its connection table.

This definition deals with the symmetry of the ?Inodes?N of a graph.
In many cases, one might wish to label the ?Iedges?N (interconnecting
lines) of a graph.  In this case, the symmetry group on the edges
rather than on the nodes is required.   It is very easy to calculate
this group from the group on the nodes. Let the numbering for each
edge
in the graph be the unordered pair of numbers of the nodes that form
the end-points.
Then the corresponding permutations can be obtained as follows:

.BEGIN NOFILL

            ?P?Di?(n?D1,n?D2?)=?(?P?Di(n?D1),?P?Di(n?D2)?)
.END CONTINUE

This is most easily shown by way of an example (Table IV).

.BEGIN NOFILL GROUP SELECT 4
.SEPLIN

?BTable IV.?N Permutation Group of Decalin Skeleton Acting on the
edges

  ?P?Di?Ua    1-2   2-3  3-8  3-4  4-5  5-6  6-7  7-8  8-9  9-10 1-10

  ?P?Dv     4-5   3-4  3-8  2-3  1-2  1-10 9-10 8-9  7-8  6-7  5-6

  ?P?Dh     9-10  8-9  3-8  7-8  6-7  5-6  4-5  3-4  2-3  1-2  1-10

  ?P?D?L180?R   6-7   7-8  3-8  8-9  9-10 1-10 1-2  2-3  3-4  4-5  5-6

?UaThe reference numbering corresponds to that given for ?B2a?N.
.SEPLIN
.END

Finding the group of an object is a special kind of labelling
problem.
Given one way of numbering (labelling with distinct labels) the parts
of the object, one finds all other ways which are equivalent.  The
labelling
problem attacked in this paper is the converse:  to find a maximal
list of labellings, none of which are equivalent.  In
general the set of all posssible labellings can be split into
subsets,
such that:

.BEGIN I6
1) If two labellings are in different subsets, they are not
equivalent.

2)  If two labellings are in the same subset, they are equivalent.
.END

These subsets are called ?Iequivalence classes?N of labellings;
selection
of one labelling from each equivalence class yields the maximal list
of non-equivalent labellings.

The relationship of finding the group, and of finding labellings
when all the labels are distinct, can be seen as follows (see ?B4?N):
view the n! possible labellings as laid out in a matrix such that
each
row contains one equivalence class. It can be shown that in the
special case where the labels are distinct, all of the equivalence
classes
are of the same size.  To find the group, one needs to find a row.
To find the labellings, one needs to pick one element from each row
(i.e. find a column).

.Figure 4,15,"Equivalence classes, Groups, and Labellings"

.TITL PART C: SUMMARY OF LABELLING STEPS

.BEGIN SPREAD←1; PREFACE 1; INDENT 0,3,3

1) Calculate the group, if not previously calculated.

2) If there are more than two different node types, do the entire
labelling process with one of the label types, and the rest "blanks";
then for each of the solutions obtained, label the "blanks" with the
remaining label types using the reduced symmetry group and collect
the results.

3) Otherwise, (only two label types),

.BEGIN INDENT 3,6,6

a) Find the orbits.

b) If more than one orbit, then
.BEGIN INDENT 6,9,9

1) Find the different "cases" -- ways of distributing the labels
among the orbits.

2) For each case,

.BEGIN INDENT 9,12,12

a) Label the first orbit.

b) Label the rest of the orbits using the reduced symmetry group
obtained from a), and collect the results.

.END END

c) Otherwise (only 1 orbit and 2 label types):

.BEGIN INDENT 6,9,9
1) If there is only one either label type, pick the "first"
node and label it with that label type.  This is the only distinct
possibility.

2) Otherwise if there are exactly two of either label type, label the
first node with that label type, calculate the reduced symmetry
group and new orbits, and from each of the new orbits, pick the
"first" node.  The solutions consist of those possibilities (one for
each new orbit).

3) Otherwise, (3 or more of each label type, and one orbit) label
the "first" node, calculate the reduced symmetry group, label the
rest of the nodes under the reduced group, and for each of the
results, check if for every permutation ?P in the original group that

.BEGIN NOFILL
                ?P(labelled set)?>labelled set
.END

.CONTINUE
If this is not true of a labelled set, discard it
as a solution.   The result is every such labelling
that satisfies this property.
.END END END

.TITL PART D:  EXTENDED EXAMPLES

.HD ?/Study of Diels-Alder Rings?N<<SIMEK:proposed by Jan Simek,
.Chemistry Department, Stanford University
.Research Proposal (unpublished), February, 1973.>?/

The labelling algorithm has been used to define the scope and
boundaries of the Diels-Alder reaction, a well-known and commonly
used synthetic reaction.  The reaction is shown in ?B11?N,
and is
defined as the 4+2 cycloaddition of an olefin, termed the
dienophile, with a conjugated diene, leading to the formation of a
cyclohexene-type of ring system (Diels-Alder Ring).

.FIGURE 11,5,Diels-Alder reaction

The method used by the program to generate Diels Alder Rings is
described below, followed by an example of the labelling procedure.
The program generated 1176 Diels-Alder Rings using any combination
of C, N, O and S.  Other atoms such as P could have been included but
were deemed not interesting to us.  A comparison of the computer
print-out with the Ring Index (which covers the literature through
1963) revealed that only 224 (about 19% of 1176) are "known" systems.
A ring system was said to be known if the same sequence of atoms had
been reported regardless of number or position of unsaturations.  The
complete list of Diels-Alder Rings is richly suggestive to the
synthetic chemists and may serve to increase the information on the
scope of the Diels-Alder reaction.

.Figure 12,20,Diagram of Method of Generation

.HD ?/METHOD  ?N(see ?B12?N)?/

.graf Step 1
The initial pot of atoms consisted of C?D6N?D6O?D4S?D4.  The number
of oxygens and sulfurs was limited to four, because no Diels-Alder
ring can be made with five or six bivalent atoms, owing to valence
restrictions.  A list of all possible 76 compositions of 6 atoms was
produced using a purely combinatorial procedure.

.graf Step 2
Eleven of the 76 compositions were eliminated, again owing
to valence limitations.  An example of the eleven compositions
eliminated is O?D3S?D3.

.graf Step 3
For each of the 65 remaining compositions, the Diels-Alder
ring skeleton was labelled with the atoms, while ensuring valence
constraints were satisfied.

.graf Step 4
The results from all labelling steps were collected, without
needing to check for duplicates.  The labelling algorithm guarantees
that the lists were irredundant.

.graf Example of labelling
An example of a valid composition is C?D4O?D2.
Diels-Alder rings formed with this composition can only have carbons
double bonded to each other (see ?B13?N).  The atoms remaining to be
assigned are C?D2O?D2 and the ring positions are numbered 1,2,3,4.

.FIGURE 13,16,Diels-Alder Example A

.HD Verification by Enumeration

The results of of the labelling procedure can be verified by
combinatorial counting techniques
.REFERENCE! DEBRUIJN?)?U,
.REFERENCE! POLYA?).
The following derivation follows closely from that in
Liu<<LIU:?N?NLiu, Introduction to Comb. Math, McGraw-Hill, 1968>.
The generating function<<genfn:generate a reference for generating
functions> of the number of labellings with two types of labels can
be calculated with the Cycle Index of Group=PG =
.BEGIN NOFILL SELECT 4
         1/2(y?U4?A?D1+ y?U2&?D1)
.END CONTINUE
and the Pattern Inventory is
.BEGIN NOFILL SELECT 4
 1/2 (x?U1?A?D1+ x?U1?A?D2)?U4+ (x?U2?A?D1+ x?U2?A?D2)?U2
.END CONTINUE
showing that there are precisely these number of labellings:
.BEGIN NOFILL SELECT 4
        labels         #labellings

       C?D4                 1
       S?D4                 1
       C?D2S?D2               4
       CS?D3                2
       C?D3S                2

.end
The third entry in the table verifies our labelling with C?D2S?D2 in
four
ways.

.HD Labellings on the Octahedral Skeletal Framework.

This example is concerned with the geometric
isomers of structures with octahedral symmetry (see ?B14?N).
.FIGURE 14,10,Octahedral Skeletal Framework
.begin nofill
.SEPLIN

            Labellings on Octahedral Skeletal Framework
Group is:

          i      (1 2 3 4 5 6)
          r?D1    (1 3 5 4 6 2)
          r?D2    (1 5 6 4 2 3)
          r?D3    (1 6 2 4 3 5)
          v?D1    (4 5 3 1 2 6)
          v?D2    (4 2 6 1 5 3)
          v?D3    (4 3 2 1 6 5)
          v?D4    (4 6 5 1 3 2)

Orbits:  ?(1,4?), ?(2,3,5,6?).

Example with labels (A A A A B B)
   Number of labels A = 4
   Number of labels B = 2


Partitioning Labels into Orbits:

.BEGIN GROUP
   Case   number of A's assigned to orbit:
    #     ?(1 4?)     ?(2 3 5 6?)

    1       2           0
    2       1           1
    3       0           2
.END

Case 1.  To map A A --> ?(1,4?)
            and A A B B --> ?(2,3,5,6?)
   Map A A --> ?(1,4?) (trivial case)
   New group   i, r?D1, r?D2, r?D3, v?D1, v?D2, v?D3, v?D4
   New orbits ?(2 3 5 6?)
   To map A A B B --> ?(2 3 5 6?)

   Case 1.1.  Assign A --> 2
      New group i,r?D2
      New orbits ?(5?),?(3,6?)

      Case 1.1.1.  Assign A --> 5
         Remaining B B --> ?(3,6?)
                                1 labelling (A A B A A B)
      Case 1.1.2.  Assign A --> 3
         Remaining B B --> ?(5,6?)
                                1 labelling (A A A A B B)

Case 2.  To map A B --> ?(1,4?)
            and A A A B --> ?(2,3,5,6?)
         Assign A --> 1 and B --> 4
         New group i, r?D1, r?D2, r?D3
         New orbits ?(2 3 5 6?)
         To map A A A B --> ?(2 3 5 6?)

  Case 2.1. Assign B --> 2
         To map A A A --> ?(3 5 6?)
                                 1 labelling (A B A B A A)

Case 3.  To map B B --> ?(1 4?)
            and A A A A --> ?(2 3 5 6?)
         Assign B --> and B --> 4
         New group  i, r?D1, r?D2, r?D3, v?D1, v?D2, v?D3, v?D4
         New orbit ?(2 3 5 6?)
         To map A A A A --> ?(2 3 5 6?)
                              1 labelling (B A A B A A)


.FIGURE 15,5,?/Four labellings of Octahedral Skeleton with Labels (A
A A A B B)?/


Example with Labels  A B C D E F
Case 1.  A --> orbit ?(1 4?)
         A --> 1
   new group  i, r?D1, r?D2, r?D3
      new orbits A1 = ?(2 3 5 6?)
                 A2 = ?(4?)
   Assign label B
   Case 1.1. B --> orbit A1
            B --> 2
      new group i; special case No Group

   Case 1.2. B --> orbit A2
            B --> 4
      new group  i, r1, r2, r3
      new orbits AA1 = ?(2 3 5 6?)
      Case 1.2.1  C --> orbit AB1
                C --> 2
         new group i; special case No group

Case 2. A --> orbit ?(2 3 5 6?)
        A --> 2
   new group i,                 ??????
   new orbits B1 = ?(1 4?)
              B2 = ?(5?)
              B3 = ?(3 6?)

   Case 2.1. B --> orbit B1
            B --> 1
      new group i; special case No Group; 24 labellings

   Case 2.2. B --> orbit B2
            B --> 5
      new group i,??                            ???????? fill in
      new orbits BB1 = ?(1 4?)
                 BB2 = ?(3 6?)

      Case 2.2.1. C --> orbit BB1
                C --> 1
         new group i; special case No Group; 6 labellings

      Case 2.2.2. C --> orbit BB2
                C --> 3
         new group i ; special case No group ; 6 labellings

   Case 2.3. B --> B3
            B --> 3
      new group i ; special case No Group; 24 labellings

90 unique labelings all together.

Verification:  When all labels are different the total number of
labellings
=?U?L(# labels)!?R?A?D?L(size of group)?R?A[--------------]
  = ?U?L6!?R?A?D8?A?L---?R=90 labellings

.end

.TITL ACKNOWLEDGEMENTS

We gratefully acknowledge the contributions of
D. H. Smith, whose aid in the preparation of this paper was
invaluable,
Jan Simek, who proposed the application in Part D,
Larry Hjelmeland who formulized the initial problem,
Professor Harold Brown, who proved the correctness of the original
algorithms, and
Professor Joshua Lederburg, whose advice and guidance has always
been an inspiration.